Meng Han

TriLens: Per-Layer Logit-Lens Entropy for White-Box Hallucination Detection

May 31, 2026
Via arXiv

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

May 26, 2026
Via arXiv

Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs

May 26, 2026
Via arXiv

Composition Collapse: Stable Factual Knowledge Does Not Imply Compositional Reasoning

May 26, 2026
Via arXiv

The Attribution Blind Spot: Detecting When Language Models Rely on Memory Rather Than Retrieved Context

May 26, 2026
Via arXiv

Silencing the Guardrails: Inference-Time Jailbreaking via Dynamic Contextual Representation Ablation

Apr 09, 2026
Via arXiv

From Retinal Evidence to Safe Decisions: RETINA-SAFE and ECRT for Hallucination Risk Triage in Medical LLMs

Apr 07, 2026
Via arXiv

AttnDiff: Attention-based Differential Fingerprinting for Large Language Models

Apr 07, 2026
Via arXiv

MO-RiskVAE: A Multi-Omics Variational Autoencoder for Survival Risk Modeling in Multiple Myeloma

Apr 07, 2026
Via arXiv

LatentAudit: Real-Time White-Box Faithfulness Monitoring for Retrieval-Augmented Generation with Verifiable Deployment

Apr 07, 2026
Via arXiv